Module 5 - Generative Classification Models and Imbalanced Data
Overview
This week, we continue our exploration of classification models by introducing generative models such as linear discriminant analysis, and generalized linear models such as Poisson regression, which enable more refined modeling of the probabilities of discrete classes or of count data. We also introduce an issue commonly encountered in real-world data called class imbalance, which arises when some classes occur far less frequently than the majority classes. Class imbalance can cause major challenges for modeling, and we discuss some computational techniques that allow us to artificially “balance” our data.
Lab 3 is due at the end of the week, as is your group project proposal.
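To make the idea of artificially balancing data concrete, here is a minimal sketch of random over-sampling, the simplest of the resampling techniques covered this week. It uses only the Python standard library; the function name `random_oversample` and the toy data are illustrative assumptions, not part of the course materials or of any library.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen rows of each smaller
    class until every class has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw the missing examples with replacement (none for the largest class)
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (made up for illustration): 8 examples of class 0, 2 of class 1
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y), "->", Counter(y_bal))
```

Under-sampling works in the opposite direction, discarding majority-class rows, while SMOTE creates new synthetic minority examples by interpolating between existing ones rather than duplicating them. In practice, resampling is typically applied to the training split only, so that evaluation data keeps the original class frequencies.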
Learning Objectives
- Techniques for dealing with imbalanced data: SMOTE, under- and over-sampling
- Linear Discriminant Analysis
- Generalized Linear Models and Poisson Regression
Readings
- ISLP (Introduction to Statistical Learning): Sections 4.4-4.6